DETAILED ACTION
Response to Amendment 

1. In response to the Office Action mailed October 31, 2006, applicant submitted an
amendment filed on April 27, 2007, in which the applicant traversed and requested
reconsideration with respect to independent claims 1, 4, 11, 15 and 22.



Response to Arguments 

2. Applicants argue that Shpiro et al. is silent on the claimed model phoneme array 
information which includes array of phonemes and word boundaries of sentences to be 
spoken by a learner, and which is used to separate sentence speech information. The 
secondary reference, Brandow et al. does not teach separating sentence speech 
information on the basis of each word included in a sentence using model phoneme
array information. Brandow is silent on the claimed model phoneme array information 
which includes array of phonemes and word boundaries of sentences to be spoken by a 
learner and which is used to separate sentence speech information. Applicant further 
argues that Acero does not teach using text to separate sentence speech information 
into word speech information based on words included in the text. Applicants'
arguments are persuasive, but are moot in view of new grounds of rejection.

Claim Rejections - 35 USC § 103

3. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

4. Claims 1-10 and 15-22 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Shpiro et al. (U.S. Patent No. 5,487,671), hereinafter referenced as 
Shpiro in view of Brandow et al. (USPN 6,064,957), hereinafter referenced as Brandow 
and in further view of Bruckert (USPN 6,029,131). 

Regarding claims 1, 4, 15 and 22, Shpiro discloses a foreign language learning 
device (figure 2, element 210), method, computer-readable medium and computer 
program (column 1, lines 57-62), hereinafter referenced as a foreign language learning 
device, comprising: 

word separation means (figure 3, element 260) for receiving sentence speech 
information (phonetic unit), the sentence speech information corresponding to speech 
produced successively by a learner (student) when the learner utters a sentence 
(student's utterance; column 7, lines 5-15) including a plurality of words (multiplicity of 
words; column 5, lines 36-40), to separate said sentence speech information (phrases; 
column 5, lines 33-41) into word speech information on the basis of each word included 
in said sentence (column 7, lines 5-15) using model phoneme array information
(column 5, lines 33-41) including an array of phonemes and word boundaries of the
sentence (column 7, lines 16-29 with column 9, lines 24-39);

likelihood determination means (figure 1, element 40 with figure 2, element 280) 
for evaluating degree of matching (similarity) of each said word speech information with 
a model speech (figure 1 with column 5, lines 10-16 and column 7, lines 43-48); and 

display output means (figure 1, element 30) for displaying, for each said word, a
resultant evaluation (figure 1, element 40) determined by said likelihood determination 
means (figure 1 with column 5, lines 10-27), but does not specifically teach a storage 
device and that the word separation means separate said sentence speech information 
on the basis of each word included in said sentence using model phoneme array 
information. 

Brandow teaches word separation means that separate said sentence speech
information (segment speech) on the basis of each word included in said sentence 
using model phoneme array information (column 1, lines 21-35), to recognize the most 
probable words for each group. 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Shpiro's device, to include an array of 
phonemes and word boundaries of the sentence, as taught by Brandow, to recognize 
the most probable words for each group by capturing and representing patterns of 
variation of the phonemes into phoneme groups (column 1, lines 21-35).

Shpiro in view of Brandow teaches a foreign language learning device, but does not
specifically teach a storage device storing model phoneme array information including 
an array of phonemes and word boundaries of a sentence to be spoken by a learner. 

Bruckert teaches a device comprising a storage device storing model phoneme 
array information (phoneme array) including an array of phonemes and word 
boundaries (word boundaries; column 4, line 50 - column 5, line 55 with column 7, 
lines 1-16) of a sentence to be spoken by a learner (sentences to be spoken; column 2, 
line 22 - column 4, line 9), in order to produce a desired synthetic spoken pattern. 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Shpiro in view of Brandow's device wherein it
comprises a storage device storing model phoneme array information including an 
array of phonemes and word boundaries of a sentence to be spoken by a learner, as 
taught by Bruckert, so that the sound of the synthesized speech may be produced by 
more accurately timing the rhythm to correspond with rhythm elements of the language 
(column 1, lines 5-42). 

Regarding claims 2, 5, and 16, Shpiro discloses the foreign language learning 
device further comprising storage means (figure 2, element 120) for storing a model 
sentence to be pronounced by said learner (prerecorded speech models) and model 
phoneme array information which corresponds to said model sentence (multiplicity of 
phonemes) and concerns the whole of said model sentence (column 5, lines 33-41 with 
column 8, lines 2-7), wherein

said display output means (figure 1, element 30) presents said model sentence
to said learner in advance (figure 5A), and 

said word separation means (figure 3, element 270) includes 

phoneme recognition means (reference audio for phonemes) for recognizing said 
sentence speech information (words/phrases) on the basis of each phoneme 
information (column 5, lines 33-41 with column 7, lines 5-15 and column 7, line 65 - 
column 8, line 7), and 

word speech recognition means for recognizing said word speech information 
(response specimen) for each said word according to said phoneme information 
(phonetic unit/phoneme) and said model phoneme array information after the 
separation (column 5, lines 33-41 with column 7, lines 5-15 and column 7, line 65 - 
column 8, line 7). 

Regarding claims 3, 6 and 17, Shpiro discloses the foreign language learning 
device wherein 

said phoneme recognition means (figure 1) includes phoneme likelihood 
determination means (figure 1, element 40 with figure 3, element 280) for determining
likelihood of each phoneme information (most similarity) in said sentence speech 
information (student's response), with respect to each of phonemes that can be 
included in said foreign language (British/American dialect; column 7, line 34 - column 
8, line 14), and 

said likelihood determination means (figure 3, element 280) evaluates the degree 
of matching of each said word speech information (evaluating the student responses;
column 7, lines 33-41) by comparing, on a likelihood distribution plane of phoneme
information (figure 1 with column 5, lines 10-16 and lines 23-27 with column 7, lines
33-48) in said sentence speech information (figure 2, element 520 with column 9, lines
2-3), each word likelihood determined along a path followed when pronunciation follows
a phoneme array exactly the same as said model phoneme array information (column 
5, lines 33-41 and lines 57-65 with column 8, lines 6-14) with the sum of word 
likelihoods (figure 5B, element 530) determined along mistakenly utterable candidate 
paths from a speech waveform of pronunciation by the learner (graphic representations 
of the waveforms; figures 6-11). 

Regarding claims 7 and 18, Shpiro discloses the foreign language learning 
device further comprising the step of evaluating a resultant pronunciation by said 
learner after practice of the pronunciation (audio specimen to be practiced; column 8, 
lines 40-45), said evaluation (evaluating) made on the basis of each said phoneme and 
said word in said model sentence uttered (student's responses) by said learner 
(column 7, lines 34-48). 

Regarding claims 8 and 19, Shpiro discloses the foreign language learning 
method wherein 

said step of evaluating a resultant pronunciation after practice thereof includes 
the step of displaying a vocal tract shape model (graphic representation of the 
waveform) for each said phoneme via a display unit to said learner (figure 1, element
30 with figures 6-11).

Regarding claims 9 and 20, Shpiro discloses the foreign language learning
device wherein 

said step of evaluating a resultant pronunciation after practice thereof includes 
the step of displaying, via a display unit (figure 1, element 30) to said learner, a model 
voice print (figure 1, element 32) and a voice print concerning pronunciation by said
learner (figure 1, element 34), said voice prints being compared with each other to be 
displayed (figure 1 with column 5, lines 10-16 and column 9, lines 24-35). 

Regarding claims 10 and 21, Shpiro discloses the foreign language learning 
method wherein 

said step of evaluating a resultant pronunciation after practice (figure 1, element 
40) thereof includes the step of displaying, via a display unit (figure 1, element 30) to
said learner, position of pronunciation by said learner on a formant plane (figure 1 with 
column 5, lines 61-65 and column 8, lines 2-7). 

5. Claims 11-14 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Shpiro et al. (U.S. Patent No. 5,487,671), hereinafter referenced as Shpiro in view of 
Acero (USPN 6,708,154) and in further view of Bruckert. 

Regarding claim 11, Shpiro discloses a foreign language learning device 
comprising: 

storage means (figure 2, element 120) for storing a model sentence to be 
pronounced by a learner (prerecorded speech models) and model phoneme array
information corresponding to said model sentence (multiplicity of phonemes; column 5,
lines 33-41 with column 8, lines 2-7); 

display output means (figure 1, element 30) for presenting said model sentence 
to said learner in advance (figure 5A); 

word separation means (figure 3, element 260) for receiving sentence speech 
information (phonetic unit) corresponding to a sentence pronounced by said learner 
(student; column 7, lines 5-15) to separate the sentence speech information (phrases; 
column 5, lines 33-41) into word speech information on the basis of each word included 
in said sentence (column 7, lines 5-15); 

likelihood determination means (figure 1, element 40 with figure 2, element 280) 
for evaluating degree of matching (similarity) of each said word speech information with 
a model speech (figure 1 with column 5, lines 10-16 and column 7, lines 43-48) on a 
likelihood distribution plane (column 5, lines 10-16 and lines 23-27 with lines 33-48); 
and 

display output means (figure 1, element 30) for displaying, for each phoneme and
each said word, a resultant evaluation (figure 1, element 40) by said likelihood 
determination means (figure 1 with column 5, lines 10-27), 

said word separation means (figure 3, element 270) including 
phoneme recognition means (reference audio for phonemes) for recognizing said 
sentence speech information (words/phrases) on the basis of each phoneme information
(column 5, lines 33-41 with column 7, lines 5-15 and column 7, line 65 - column 8, line 
7), and

word speech recognition means for recognizing said word speech information
(response specimen) for each said word according to said phoneme information 
(phonetic unit/phoneme) and said model phoneme array information after the 
separation (column 7, lines 5-15 and column 7, line 65 - column 8, line 7), and 

said foreign language learning device further comprising pronunciation evaluation 
means for evaluating a resultant pronunciation after practice of the pronunciation 
(audio specimen to be practiced; column 8, lines 40-45) for each said phoneme and for
each said word in said model sentence uttered (student's responses) by said learner in 
a pronunciation practice period (column 7, lines 34-48), but does not specifically teach
that a storage device and the word separation means include an array of phonemes
and word boundaries of the sentence.

Acero teaches a word separation means (figure 4, element 294) to separate said 
sentence speech information into word speech information on the basis of each word 
included in said sentence using model phoneme array information including an array of 
phonemes and word boundaries of the sentence (column 5, lines 11-20), to provide 
better smoothing during formant tracking. 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Shpiro's device, to include an array of 
phonemes and word boundaries of the sentence, as taught by Acero, to identify the 
most likely sequence of formant groups based on training text (column 5, lines 11-20). 



Application/Control Number: 09/936,365 Page 11 

Art Unit: 2626 

Shpiro in view of Acero teaches a foreign language learning device, but does not
specifically teach a storage device storing model phoneme array information including 
an array of phonemes and word boundaries of a sentence to be spoken by a learner. 

Bruckert teaches a device comprising a storage device storing model phoneme 
array information (phoneme array) including an array of phonemes and word 
boundaries (word boundaries; column 4, line 50 - column 5, line 55 with column 7, 
lines 1-16) of a sentence to be spoken by a learner (sentences to be spoken; column 2, 
line 22 - column 4, line 9), in order to produce a desired synthetic spoken pattern. 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Shpiro in view of Acero's device wherein it 
comprises a storage device storing model phoneme array information including an 
array of phonemes and word boundaries of a sentence to be spoken by a learner, as 
taught by Bruckert, so that the sound of the synthesized speech may be produced by 
more accurately timing the rhythm to correspond with rhythm elements of the language 
(column 1, lines 5-42). 

Regarding claim 12, Shpiro discloses the foreign language learning method 
wherein 

said step of evaluating a resultant pronunciation after practice thereof includes 
the step of displaying a vocal tract shape model (graphic representation of the 
waveform) for each said phoneme via a display unit to said learner (figure 1, element 
30 with figures 6-11).

Regarding claim 13, Shpiro discloses the foreign language learning device
wherein 

said step of evaluating a resultant pronunciation after practice thereof includes 
the step of displaying, via a display unit (figure 1, element 30) to said learner, a model 
voice print (figure 1, element 32) and a voice print concerning pronunciation by said 
learner (figure 1, element 34), said voice prints being compared with each other to be 
displayed (figure 1 with column 5, lines 10-16 and column 9, lines 24-35). 

Regarding claim 14, Shpiro discloses the foreign language learning method 
wherein 

said step of evaluating a resultant pronunciation after practice (figure 1, element 
40) thereof includes the step of displaying, via a display unit (figure 1, element 30) to
said learner, position of pronunciation by said learner on a formant plane (figure 1 with 
column 5, lines 61-65 and column 8, lines 2-7). 

Conclusion 

6. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Jakieda R. Jackson whose telephone number is
571-272-7619. The examiner can normally be reached on Monday, Tuesday and Thursday
7:30 a.m. to 5:00 p.m.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, David Hudspeth, can be reached on 571-272-7843. The fax phone number
for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 



For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 

JRJ 

July 19, 2007



DAVID HUDSPETH 
SUPERVISORY PATENT EXAMINER 
TECHNOLOGY CENTER 2600 




